English-Arabic Transliteration

Authors

  • MOHAMED ABDEL FATTAH
  • FUJI REN
Abstract

Proper nouns are arguably the most important query words in information retrieval. If two languages use the same alphabet, the same proper nouns can be found in either language. If they use different alphabets, however, the names must be transliterated. Short vowels are usually not marked in Arabic documents (except in very important documents such as the Muslim and Christian holy books). Moreover, most Arabic words follow a consonant-vowel (CV) syllable pattern, which means that most Arabic words contain a short or long vowel between two successive consonant letters. This makes it difficult to create English-Arabic transliteration pairs, since some English letters may not match any Romanized Arabic letter. In the present study, we present different approaches for extracting transliterated proper noun pairs from parallel corpora, based on different similarity measures between the English and Romanized Arabic proper nouns under consideration. The strength of our new system is that it works well for low-frequency proper noun pairs. We evaluate the new approaches using two different English-Arabic parallel corpora. Most of our results outperform previously published results in terms of precision, recall and F-measure.

Key-Words: Machine transliteration, Parallel corpora, Cross-language information retrieval.
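The abstract does not say which similarity measures the authors use, so the following is only a minimal sketch of the general idea, not their method. It assumes the Arabic name has already been Romanized, drops the English vowels that an unvowelled Arabic spelling would typically omit, and compares the remaining consonant skeletons with a generic string-similarity ratio. The function names, the vowel set, and the 0.7 threshold are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher

# Assumption: these five letters stand in for the vowels that are
# usually left unmarked in Arabic text. Long vowels, which Arabic does
# write, are not handled separately in this sketch.
VOWELS = set("aeiou")


def consonant_skeleton(name: str) -> str:
    """Lower-case the name and keep only its non-vowel letters."""
    return "".join(ch for ch in name.lower() if ch.isalpha() and ch not in VOWELS)


def skeleton_similarity(english: str, romanized_arabic: str) -> float:
    """Similarity in [0, 1] between the consonant skeletons of two names."""
    a = consonant_skeleton(english)
    b = consonant_skeleton(romanized_arabic)
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a, b).ratio()


def best_match(english: str, candidates: list[str], threshold: float = 0.7):
    """Return the candidate most similar to `english`, or None if no
    candidate reaches the (hypothetical) threshold."""
    if not candidates:
        return None
    score, candidate = max((skeleton_similarity(english, c), c) for c in candidates)
    return candidate if score >= threshold else None
```

In a parallel corpus, `candidates` would be the Romanized Arabic words of the sentence aligned with the English sentence that contains the proper noun. Because the score depends only on the two strings being compared, a pair can be matched even if it occurs only once, which is the low-frequency case the abstract emphasizes.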


Similar resources

Transliteration Experiments on Chinese and Arabic

We report the results of our transliteration experiments with language-specific adaptations in the context of two language pairs: English to Chinese, and Arabic to English. In particular, we investigate a syllable-based Pinyin intermediate representation for Chinese, and a letter mapping for Arabic.


Machine Transliteration of Names in Arabic Text

We present a transliteration algorithm based on sound and spelling mappings using finite state machines. The transliteration models can be trained on relatively small lists of names. We introduce a new spelling-based model that is much more accurate than state-of-the-art phonetic-based models and can be trained on easier-to-obtain training data. We apply our transliteration algorithm to the translit...


Developing the Transliteration Interface for Arabic Text

In Arabic-English and English-Arabic translation, the interface is very significant. Translation involving Arabic raises many issues that need to be addressed. Existing systems have some problems, and research has been initiated to improve them. Transliteration is an important component of translation. In this study, we propose an interface system for Arabic transliteration. Th...


Automatic Transliteration and Back-transliteration by Decision Tree Learning

Automatic transliteration and back-transliteration across languages with drastically different alphabets and phoneme inventories, such as English/Korean, English/Japanese, English/Arabic and English/Chinese, have practical importance in machine translation, cross-lingual information retrieval, and automatic bilingual dictionary compilation. In this paper, a bi-directional and to some exte...


Arabic to English Person Name Transliteration using Twitter

Social media outlets are providing new opportunities for harvesting valuable resources. We present a novel approach for mining data from Twitter for the purpose of building transliteration resources and systems. Such resources are crucial in translation and retrieval tasks. We demonstrate the benefits of the approach on Arabic to English transliteration. The contribution of this approach includ...



Publication date: 2007